-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
some cleaning #37
some cleaning #37
Conversation
…tor128.shuffle (DO NOT USE)
@Nick-Nuon I solved the performance problem with ARM/NEON. Lesson learned: do not use Vector128.Shuffle for anything. |
@Nick-Nuon Note that our routines are likely now buggy, but I am hoping it is easily fixable. I recommend keeping the benchmarks alive so that we no longer regress. The current speed is good and we should aim to keep it. |
--------- Co-authored-by: Daniel Lemire <[email protected]>
@Nick-Nuon My last commit f037b96 simplifies the logic of the Avx2 validator and should fix the tests. Can you review? |
@Nick-Nuon The failures on Arm64 are expected. |
I'll work on the Arm64 bit this morning. :-) |
Im on it! |
@Nick-Nuon I expect to have the arm component in a few minutes. |
@Nick-Nuon I have pushed a commit. Arm should be good now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks kosher.
There were only 2 minor typos in comments so I took the liberty to change them directly which triggered the tests again. I'll merge as soon as the tests pass.
@Nick-Nuon Great stuff. I recommend moving iteratively from this point forward. We have a good basis. I trust these tests. We just need to add Sse and AVX-512... but it is somewhat easier than the work done so far. :-) |
Merging! Yeah, I think so too. I really don't regret the time we spent on the the tests. SSE in particular should be straightforward, given how close it is too avx-2. |
This PR aims to clean the code a bit, recover good performance and fix the Avx2 tests.